perm filename BES.KRD[AM,DBL] blob sn#613285 filedate 1981-09-21 generic text, type T, neo UTF8
13-JUL-81 20:39:25-PDT,23308;000000000000
Mail-from: Arpanet host MIT-AI rcvd at 13-JUL-81 2036-PDT
Date: 13 July 1981 23:32-EDT
From: Randall Davis <KRD at MIT-AI>
To: don at RAND-UNIX, lenat at PARC-MAXC


Ok gentlemen, better sit down before you read this.

Believe it or not, herewith at last are the notes for the meta-knowledge
and/or meta-cognition chapter.  I make no claims about them being incredibly
superb, but I do think they are a reasonable distillation of the ideas we
put together in that week.

Once again, I am most willing to be a participating member of the group,
but am too snowed under to be the motivating force behind this.

Hope this helps get the book finished up.  Let me know if there is more
I can do.

cheers
Randy

*******

                           Overview of Meta-Cognition

                 Strategic, Descriptive and Systemic Knowledge


1)  Introduction
	The basic approach we take is an evolutionary one.  We trace
through the evolution of a system as we add to it a variety of forms of
competence:
	problem solving performance - Can it solve the problem?
	explanatory capability - Can it explain how it solved the problem?
	evolutionary capability - Can it get better at solving the problem by
					 learning more about the domain?
	reconfiguration - Can it alter its construction and design in response to
				experience on the problem?

	As it turns out, this sequence of competences closely parallels the
real concerns that have developed historically.  Initially the only concern
was performance.  Later an explanatory capability became important, followed
soon after by recognition of the importance of the ability to learn and evolve.
Most recently interest has been expressed in capturing the competence of the
expert system designer.
	In part this also reflects a broadened perspective on what constitutes
competence.  An expert is more than someone who "merely" solves a problem.  He
is capable as well of explaining, learning, reorganizing, etc.  All of these
forms of knowledge we consider "meta-knowledge", and will explore them in the
sections that follow.
	What we hope to give you in this chapter is a set of categories that provide
an analytic framework useful for examining and thinking about problems, illuminating
and structuring knowledge about a task, and perhaps helping to point out gaps in
that structure.


2)  Meta-knowledge: What is it?

Brief description of problem domain (or will other chapters have done it
already??)

Now consider some statements from the problem description:

	(from pages 126-129)

	"If the spill is flowing continuously, direct methods based on
	backtracking are possible.

	If the flow has stopped, indirect methods usually involving chemical
	analysis and inventory comparisons are required.

	The backtracking approach is always used first if the source is
	unknown and an oil spill has reached the stream.

	If backtracking is not feasible, indirect methods based on chemical
	identification must be used.

	In general chemical analysis requires one to several days unless
	priorities for the analytical chemistry work load are shifted.

	In addition, the spill sample may have weathered extensively so that
	comparison of the chemical results with standard examples may not
	produce a completely reliable result.

	There are 33 standard oil types in the ORNL inventory and roughly
	1000 different locations with a gallon or more of oil.  Thus, on
	the average there will be 30 possible locations for the spill source.
	There are also many special purpose oils in the inventory ... which
	may make the identification a quite difficult chemical analysis problem.

And this one from one of the spill episodes (pg 180-181):

	Since this occurrence was really the result of poor operating
	practice rather than an unusual occurrence as might be implied by
	the oil spill regulations, it was not judged necessary to report
	the incident as an oil spill.


The interesting thing is that such information includes a range of advice about
/problem solving/, about the /methods available/, their /applicability/,
/justification/, /resource usage/, /reliability/, /utility/, etc.  In each
statement we have knowledge that helps us to use our knowledge of the domain.

It is this "knowledge about knowledge" that we refer to as meta-level knowledge.

It comes in several varieties, which we refer to as /strategic/, 
/descriptive/, and /systemic/.


Strategic:
	Includes information about control, e.g., the ordering of methods --

	The backtracking approach is always used first if the source is
	unknown and an oil spill has reached the stream.

	If backtracking is not feasible, indirect methods based on chemical
	identification must be used.

	If the spill is flowing continuously, direct methods based on
	backtracking are possible.

	If the flow has stopped, indirect methods usually involving chemical
	analysis and inventory comparisons are required.


Descriptive
	Includes information about the justification, certainty, resource 
	requirements, or accuracy of object-level knowledge --


	In general chemical analysis requires one to several days unless
	priorities for the analytical chemistry work load are shifted.

	In addition, the spill sample may have weathered extensively so that
	comparison of the chemical results with standard examples may not
	produce a completely reliable result.

	Since this occurrence was really the result of poor operating
	practice rather than an unusual occurrence as might be implied by
	the oil spill regulations, it was not judged necessary to report
	the incident as an oil spill.


Systemic
	Includes information about the structure of the problem that allows
	us to determine appropriate knowledge engineering techniques
	(e.g., in this case, the size of the search space and the difficulty of
	searching it).

	There are 33 standard oil types in the ORNL inventory and roughly
	1000 different locations with a gallon or more of oil.  Thus, on
	the average there will be 30 possible locations for the spill source.
	There are also many special purpose oils in the inventory ... which
	may make the identification a quite difficult chemical analysis problem.


Motivation:
	
As we will see, there are several reasons why this form of information 
can be useful:

	performance: control m-k helps constrain search
	flexibility: descriptive m-k makes it easier to make changes to the program

and several reasons why it tends to crop up in the first place:

	people don't talk in formal terms.  E.g., when telling you about a domain
	they don't supply complete, consistent information.  Just as we need to
	deal with inexactness of inference at the object level, here we need a
	way to capture the inexactness and incompleteness of domain information
	at the meta-level.  [The idea is to be able to deal with stuff like:
	"this rule is mostly right, except under the following conditions", or
	"chemical analysis isn't totally reliable".]

	people tell you all sorts of things about the problem.  As the examples
	above illustrate, there is a wide variety of information offered by an
	expert when describing a domain and the problem solving techniques he uses.

	programs have all sorts of knowledge in them.  Directly or (more often)
	indirectly, almost all expert systems contain information of the sort
	illustrated above.

If it's going to be present (and it always seems to be), then we should deal
with it directly.  That is, we ought to recognize it and represent it explicitly
in a language that is high level, economical in expression, and has the
appropriate primitives.  Our claim is that the program will function better and
will be easier to build and improve as a result.  Examples given throughout the
rest of this chapter help to make this point.

Our intent here is also to categorize, describe, and suggest ways of dealing
with each of the varieties of meta-level knowledge we encounter.

The next several sections of this chapter describe each of the varieties of
meta-knowledge in more detail, explaining the concept and providing examples from
the problem at hand.

3)  The problem domain and simplest architecture.
	The spill problem domain.
	The architecture of a simple, goal-directed backward chaining system,
		to be used as a starting point and straw man.

		Use spill amelioration as a task
			e.g. if spill is sulphuric acid, use anion-exchanger
			     if spill is sulphuric acid, use acetic acid

Caveat:
	Throughout we use this /rule/-based system as a focus for the discussion.
But none of the material here is constrained to a rule representation.  All
of the discussion is oriented to issues of kinds of knowledge and the appropriate
organization for that knowledge; we do not deal with issues that are particular
to a specific representation.
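
A modern sketch of the straw-man architecture (the rules, facts, and function
names here are illustrative stand-ins, not taken from any actual spill system):

```python
# A minimal goal-directed backward chainer, used only as a straw man.
# Rules pair a conclusion with the premises needed to establish it.

RULES = [
    # (conclusion, premises)
    ("use anion-exchanger", ["spill is sulphuric acid"]),
    ("use acetic acid",     ["spill is sulphuric acid"]),
]

FACTS = {"spill is sulphuric acid"}

def prove(goal, facts, rules):
    """Try to establish `goal` by backward chaining through `rules`."""
    if goal in facts:
        return True
    for conclusion, premises in rules:
        if conclusion == goal and all(prove(p, facts, rules) for p in premises):
            return True
    return False

print(prove("use acetic acid", FACTS, RULES))   # True
```

Note that nothing here says which of the two applicable rules to try first;
that is exactly the gap strategic meta-knowledge fills.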

4)  Strategic meta-knowledge

example:
Two rules
	if sulphuric acid, use anion-exchanger.
	if sulphuric acid, use acetic acid.

Meta-rules
	Use rules that employ cheap materials before those that employ more
	expensive materials.

	Use less hazardous cleanup methods before more hazardous methods.

Note that these rules require that the system be able to reason about the items
mentioned in the rules, in one case that acetic acid is cheap and
anion-exchanging is generally more expensive, and in the other that one chemical
may be more dangerous to work with than the other.  This has been referred to as
"content-reference".
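
A sketch of how such meta-rules might be applied, using content-reference
facts about the materials the object-level rules mention (the cost and hazard
ratings below are illustrative assumptions):

```python
# Strategic meta-rules as rule-reordering: rather than invoking candidate
# rules in arbitrary order, sort them using knowledge *about* the items
# they mention ("content reference").  All ratings are illustrative.

CANDIDATES = [
    {"if": "spill is sulphuric acid", "use": "anion-exchanger"},
    {"if": "spill is sulphuric acid", "use": "acetic acid"},
]

# Knowledge about the items mentioned in the rules.
MATERIALS = {
    "anion-exchanger": {"cost": "expensive", "hazard": "low"},
    "acetic acid":     {"cost": "cheap",     "hazard": "moderate"},
}

COST_RANK   = {"cheap": 0, "expensive": 1}
HAZARD_RANK = {"low": 0, "moderate": 1, "high": 2}

def meta_order(rules):
    """Meta-rules: prefer cheap materials, then less hazardous ones."""
    def key(rule):
        m = MATERIALS[rule["use"]]
        return (COST_RANK[m["cost"]], HAZARD_RANK[m["hazard"]])
    return sorted(rules, key=key)

for r in meta_order(CANDIDATES):
    print(r["use"])     # acetic acid first, then anion-exchanger
```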

Strategic meta-knowledge was historically one of the first forms to be
recognized, though it was generally encoded indirectly.  By indirect, we mean
that various problem domains forced system builders to contend with it, but they
often did so with mechanisms that were added on to their system designs as
afterthoughts.  The idea has shown up in

	conflict resolution in production systems (a variety of control schemes have
		been explored, but they are generally constrained to be a
		single, global criterion employed continually)

	Planner (advice lists; an indirect mechanism in the sense that it
		indicated, e.g., order information only by keeping the final
		desired order of operators, with no indication of the rationale
		for that ordering)

	Teiresias (meta-rules; a more direct mechanism, perhaps the first
		attempt to encode strategic knowledge /uniformly/ and /explicitly/.
		Used a restricted syntax dealing with order and expected utility of
		a rule.)

	NASL ("choice rules"; used to rule in a KS, rule it out, or to reformulate
		the goal.)

	Molgen (Recognized the importance of meta-knowledge in planning and developed
		a layered planning model which took explicit account of it.)


[Aside: the example above is a "local" version of the attempt to produce a
traditional sort of minimum cost solution.  Rather than trying to enforce a
global minimization, we try to keep costs down at each step, presumably in the
hope that the problem is somewhat decomposable: ie, that low costs at each step
results in low total cost.  This is not always true (economies at one point may
result in high costs later), but is a good guess.]

[Comment: I will admit to being less than objective on the point, but I believe
my 1980 AI Jnl paper on Meta-rules is a fairly complete discussion of the
"reordering" view of strategic meta-k.  That view isn't the whole story of
course, but I think that paper says about as much on that view as is currently
worth doing.  I think Stefik's planning with constraints paper (also AI Jnl) is
a good follow-on, in the sense that it tries to put structure on the various
meta-levels.  I would suggest that a distillation of those two papers would be a
good and quick way to create this section of the chapter.]

5)  Descriptive meta-knowledge 

Two major varieties:

Knowledge about content
Knowledge about form

Within each major variety we can identify a number of different forms. 
(Lists are intended to be samplers and have no pretense of being complete.)

	Knowledge about content
		justifications (what makes you believe it?)
			(Mycin did this, but in simplest form of text strings;
			 Swartout's justifications in the form of domain principles
			 are a more substantive version)
			definition
			authority (eg, OMTADS; Mycin's journal references)
			induction (Mycin's jnl refs are also partly inductive, at
				  least to the extent that you actually read the
				  journal article and understand the reasoning
				  behind the author's conclusion)
			deduction
			observation
			hypothesis
			generic principle (eg, Swartout)
		certainty (how sure are you of the conclusion?)
			(Mycin's CFs, Prospector's likelihood ratios, others?)
		reliability (how good is this method generally?)
			[Not clear if certainty and reliability are totally distinct.
			The idea is that certainty is associated with a single rule while
			reliability refers to a method in general (eg, the chemical
			analysis/weathered sample issue).]
			(Any examples?)
		resource requirements (how much effort will a method require?)
			("several days" example from oil spill)
		accuracy (how precise are the results likely to be?)
			(A computation may give you five digits' accuracy, but you
			 might suspect the method used.  Any examples?)

	Knowledge about form and representation
		syntax of rules and other knowledge-carrying structures in the
			system (Teiresias' data structure schemas [thesis, chapt 6],
			others?)
		semantics of rules
			(indirectly, Teiresias' content reference, since it recognized
			 the difference between something appearing in a rule premise
			 vs in a conclusion.)

Utility of Descriptive Meta-knowledge
	In explanation

	Traditionally, the question WHY is answered via a replay of the inference
performed.  This is a picture of "why" that sees /goal structure/ (of
the agent performing the action) as an appropriate answer.

	A number of programs have worked within this model; Mycin was probably
the first.  A slight improvement is seen in [Swartout; thesis or ijcai paper],
where explanations were still seen in terms of the goal/supergoal notion, but
the expression of the linkage between goal and supergoal was derived from the
"domain principle" behind the specific rule.  Instead of simply replaying the
exact rule used, the system determined what generic principle the rule was an
instance of, and gave that instead.  This produced somewhat more general and
often somewhat more comprehensible explanations.  While it was still following
the "retrace the performance" model of explanation, the restructuring of the
program as a collection of domain principles helped to make that performance
more understandable.

[Possibly interesting sidelight:  I claim that much of the power of Swartout's stuff
came from this restructuring.  It wasn't so much that there was a novel view of what
explanation was (ie, he was still doing performance replaying), as that the program
itself had improved quite a bit by being recast in his domain principle framework.
The automatic programming idea is a nice way to keep around the links from the
current performance program to those domain principles, but the key issue is (a) that
the program is oriented around such principles, and (b) that the appropriate links
arise somehow (they could have been edited in by hand; snazzier to derive them
automatically, but not necessary for the performance of the explanation stuff.)
Anybody agree/disagree with this?]

	Others have looked outside the goal/supergoal framework and noted that
the question can mean a variety of things (e.g., [Davis, thesis, chapt 3],
[Genesereth, "Why"]).  Given a goal/supergoal pair and the rule that relates
them (ie, the traditional replay mode of explanation) one can easily ask "why"
and mean "why is that true", ie, give me the justification for that rule.
Hence, justification is an important kind of meta-knowledge useful for
explanations.  [This is a bit late to be making this point, so maybe it should
be made earlier on.  We're trying to motivate the utility of descriptive meta-k
and one point is simply that it provides a different "dimension" of explanation.]

Descriptive meta-knowledge in the form of justifications is also useful in 
knowledge acquisition and knowledge-base evolution.

Example:

suppose you have three rules of the form

if spill is sulphuric acid, use lime.
Justification:  lime neutralizes acid and the compound that forms is insoluble
and hence will precipitate out.

if spill is acetic acid, use lime.
Just: lime neutralizes the acid.

if spill is hydrochloric acid, use lime.
Just: lime neutralizes the acid.

     Now suppose lime runs out and lye is suggested as a replacement, and we
know that the compound formed by lye and sulphuric acid is soluble.
[I believe this last claim about solubility is correct, but best check it.  We
checked this with Johnson, but my notes may not be perfect.]

What we really want to do is go through all rules and substitute lye for lime
/in those rules which use lime solely to neutralize pH/.

The point is that substitutions work only when you know /why/ the thing was
being suggested in the first place.
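
A sketch of what justification-guided substitution might look like (the
justification tags are illustrative, not drawn from any real knowledge base):

```python
# Justification-guided knowledge-base editing: substitute lye for lime
# only in rules whose justification is *solely* neutralization, since the
# lye/sulphuric-acid compound is soluble and won't precipitate out.

rules = [
    {"if": "sulphuric acid", "use": "lime",
     "just": {"neutralize", "insoluble-precipitate"}},
    {"if": "acetic acid",      "use": "lime", "just": {"neutralize"}},
    {"if": "hydrochloric acid", "use": "lime", "just": {"neutralize"}},
]

def substitute(rules, old, new, only_for):
    """Swap `old` for `new`, but only where the justification for using
    `old` is exactly `only_for` -- i.e. where the reasons still hold."""
    for r in rules:
        if r["use"] == old and r["just"] == only_for:
            r["use"] = new

substitute(rules, "lime", "lye", only_for={"neutralize"})
print([r["use"] for r in rules])   # ['lime', 'lye', 'lye']
```

The sulphuric acid rule keeps lime precisely because its justification
mentions more than neutralization.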


Knowledge acquisition and knowledge base evolution have also been aided by the
use of descriptive meta-knowledge in the form of knowledge structure syntax:

Example:  
Teiresias' use of data structure schemata.  Knowledge acquisition became a
process of filling in the schemata; the "dialogue" between system and expert was
guided by a simple "interpreter" that went thru the schema, prompting the expert
for each piece of information it needed from him.

The schemata were descriptions of the /syntax/ of the knowledge base.  They
indicated both the syntax of individual data structures and the interrelations
between them, so that both could be appropriately taken care of during
acquisition.
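
A sketch of such a schema "interpreter" (the schema slots and prompts are
illustrative, not Teiresias' actual data structure schemata):

```python
# Schema-driven knowledge acquisition: a schema describes the syntax of a
# rule, and a small interpreter walks the schema, prompting the expert for
# each piece of information it needs.

RULE_SCHEMA = [
    {"slot": "premise",    "prompt": "Under what condition does the rule apply?"},
    {"slot": "conclusion", "prompt": "What action does the rule recommend?"},
    {"slot": "certainty",  "prompt": "How certain is the conclusion (0-1)?"},
]

def acquire(schema, ask=input):
    """Fill in a new knowledge structure by walking its schema."""
    return {field["slot"]: ask(field["prompt"] + " ") for field in schema}

# Example (non-interactive): supply answers from a canned dialogue.
answers = iter(["spill is acetic acid", "use lime", "0.9"])
new_rule = acquire(RULE_SCHEMA, ask=lambda prompt: next(answers))
print(new_rule["conclusion"])   # use lime
```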

6)  Systemic meta-knowledge

Loosely speaking, systemic meta-knowledge is that knowledge about system
design that exists in the head of the knowledge engineer.  What is it that
a good system designer knows and puts to work in coming up with, or even
simply selecting from among such architectures as Macsyma, Hearsay, Mycin,
Harpy, AM, Dendral, etc.?

It seems to involve knowledge about
	consistency of the available knowledge
	completeness of the available knowledge
	size and shape of the search space
	desired efficiency of the system

Examples of systems that have tried to capture such information:
	Barstow-Kant on auto programming
	Programmer's apprentice work
	Goldstein-Miller on problem reformulation
	Macsyma advisor


Utility of systemic knowledge

1) System construction, obviously.  If we actually had a fair body of such
information then the production of expert systems might be automated in a
fashion far more ambitious than the hand-holding knowledge acquisition currently
possible.

2) System re-construction.  A somewhat more modest goal: can the system
reconfigure itself in response to experience and what it may have learned about
the domain?  A simple example: changing representations of facts because they
are accessed very often, or very rarely (did AM do anything like that?).

3) A new form of explanation.
Consider a program that's in the midst of a square root calculation.  If we
interrupt it and ask what it's doing, the answer might be that it was dividing
two numbers.  If we ask why, the answer is "in order to get the square root, via
Newton's method."  A second "why" might mean either, why are you computing a
square root, or why via Newton's method.  In the second case, we go outside the
goal/supergoal framework of traditional explanation to explain why one of a
number of design choices was made.
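
A sketch of the square-root scenario, using an explicit stack of (goal,
rationale) pairs so that successive "why?" questions can be answered from
either the goal structure or the design choice behind it (all names here
are illustrative):

```python
# Keep an explicit stack of (goal, rationale) pairs during computation.
# An "interrupt" snapshots the stack, so "why?" can be answered first from
# the goal structure and then from the design choice behind it.

snapshot = None

def sqrt_newton(a, iters=20):
    global snapshot
    stack = [("compute square root of %g" % a,
              "design choice: Newton's method")]
    x = a
    for _ in range(iters):
        stack.append(("divide two numbers", "one step of the Newton iteration"))
        if snapshot is None:
            snapshot = list(stack)        # "interrupt" on the first division
        x = (x + a / x) / 2               # the division we might be caught doing
        stack.pop()
    return x

root = sqrt_newton(2.0)
# Replaying the snapshot, innermost goal first, answers successive "why?"s:
for goal, rationale in reversed(snapshot):
    print("why? ->", goal, "|", rationale)
```

The second answer, the rationale for Newton's method, is exactly the step
outside the goal/supergoal framework that systemic meta-knowledge supports.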

7)  Another Approach

	I'm not totally satisfied with this outline for two reasons:

1) it doesn't quite follow the suggestion made at the outset of following
the development of
	problem solving performance
	explanatory capability
	evolutionary capability (knowledge acquisition)
	reconfiguration
Rather, it's organized around strategic, descriptive, and systemic meta-k.
That's a good indexing, but so is the one that deals with those four
capabilities.  Problem is they don't quite match up.

As a result, we point to, e.g., explanation, several times in the outline in
different sections, since various forms of meta-k have their impact.

2) It still seems to leave at least one dangling issue:  meta-k in planning.
Possibly this fits with the strategic stuff.  I tend to think of it as putting
more meat on that particular set of bones.  The original conceptions of
strategic k are mostly ordering things; the levels of planning a la Stefik try
to address the issues at the level of what kinds of things one should think
about at each level.  Ordering info follows from that, but I think there's more
there than simply ordering info.

Suggestion:

It might be useful to do this chapter two ways.  The original formulation still
seems reasonable to me, but maybe we also want to do the "task oriented"
approach (performance, explanation, ka, reconfig).  Where the mapping is clear
(ie, the task of reconfiguration seems to require only systemic m-k), little
need be said, but there are places where the two indexing schemes (by variety
of meta-k and by task) don't match up.  Using just one of those schemes seems
to leave gaps.  

I think the biggest gap is in the explanation stuff.  In particular, Genesereth's
"Why" paper points out an interesting range of what that question can mean.  The
list below is derived from that paper, with some additions and a little
reorganization:
	the goal-subgoal replay approach (Winograd, Mycin)
	the abstracted goal-subgoal replay approach (Swartout)
	the "why is that true" question.  Ex: Mycin answers with its
		rule, and "why" means, but why is that rule valid?
		(mentioned earlier under justification)
	why does that accomplish the goal (teleology; how is it that that
		action contributes to the goal).  Seems to require deeper knowledge
		of the domain.
	why is that a good way to accomplish that goal (the design space
		question).  Seems to require systemic meta-k about design space,
		efficiency, etc.

Might want to do this for the other tasks
	performance: systemic meta-k, strategic meta-k
	ka: covered in its own chapter, but might want to consider the various
		forms of meta-k that contribute to it.  One motivating question:
		How is knowledge acquisition a knowledge-based task?
	reconfiguration: mostly systemic meta-k.

Also, we haven't talked about CAI much.  Clancey's work is mostly on how
teaching is a knowledge-based task.  What form of meta-k is involved there?
What examples do we have?

Should we add this as a final task and talk about it as well?
It's important enough and I think even fits well in our "historic" approach,
in that it was indeed the most recent capability added to expert systems.